Machine Learning 


Clustering 


Supervised learning 
introduction 




Given labeled training set: 

$\{(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), (x^{(3)}, y^{(3)}), \ldots, (x^{(m)}, y^{(m)})\}$



Unsupervised learning 



Clustering algorithm: 
group these data into two 
different clusters. 


Given unlabeled training set: $\{x^{(1)}, x^{(2)}, x^{(3)}, \ldots, x^{(m)}\}$



Market segmentation 



Organize computing clusters 



Social network analysis 


Astronomical data analysis 






Clustering 

K-means algorithm 



>The K-means algorithm is by far the most widely used clustering algorithm. 

>K-means is an iterative algorithm that repeats two steps: first a cluster 
assignment step, and second a move centroid step. 

>In the cluster assignment step, the algorithm goes through each of the training 
examples and, depending on which cluster centroid it is closer to, assigns that 
data point to one of the two cluster centroids. 

>In the move centroid step, the algorithm moves each of the two cluster 
centroids to the average of the points assigned to the same cluster. 

>Running additional iterations of K-means from here, the cluster centroids will 
not change any further, so at this point K-means has converged. 


K-means algorithm 
Input: 

- K (number of clusters) 

- Training set $\{x^{(1)}, x^{(2)}, \ldots, x^{(m)}\}$ 

$x^{(i)} \in \mathbb{R}^n$ (drop $x_0 = 1$ convention) 


K-means algorithm 


Randomly initialize K cluster centroids $\mu_1, \mu_2, \ldots, \mu_K \in \mathbb{R}^n$ 
Repeat { 

    for i = 1 to m 

        $c^{(i)}$ := index (from 1 to K) of cluster centroid 
                     closest to $x^{(i)}$, i.e. $c^{(i)} = \arg\min_k \|x^{(i)} - \mu_k\|^2$ 

    for k = 1 to K 

        $\mu_k$ := average (mean) of points assigned to cluster k 

} 
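The two steps of the loop above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the course's reference implementation: the function names (`squared_distance`, `mean_point`, `kmeans`) are hypothetical, cluster indices are 0-based rather than 1 to K, centroids are initialized to K randomly chosen training examples, and a centroid with no assigned points is simply left where it is.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def kmeans(points, k, max_iters=100, seed=0):
    """Run K-means; return (assignments, centroids) with 0-based indices."""
    rng = random.Random(seed)
    # Assumption: initialize the centroids to k distinct training examples.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    assignments = None
    for _ in range(max_iters):
        # Cluster assignment step: index of the closest centroid per point.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # nothing moved, so further iterations change nothing
        assignments = new_assignments
        # Move centroid step: each centroid becomes the mean of its points.
        for j in range(k):
            members = [p for p, c in zip(points, assignments) if c == j]
            if members:  # assumption: an empty cluster keeps its centroid
                centroids[j] = mean_point(members)
    return assignments, centroids


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    labels, centers = kmeans(data, k=2)
    print(labels, centers)
```

Which cluster receives index 0 depends on the random initialization, so only the grouping of the points is meaningful, not the label values.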







Clustering 

Optimization objective 



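K-means optimization objective 

$c^{(i)}$ = index of cluster (1, 2, ..., K) to which example $x^{(i)}$ is currently 
assigned 

$\mu_k$ = cluster centroid k ($\mu_k \in \mathbb{R}^n$) 

$\mu_{c^{(i)}}$ = cluster centroid of cluster to which example $x^{(i)}$ has been 
assigned 

Optimization objective: 

$J(c^{(1)}, \ldots, c^{(m)}, \mu_1, \ldots, \mu_K) = \frac{1}{m} \sum_{i=1}^{m} \|x^{(i)} - \mu_{c^{(i)}}\|^2$ 

$\min_{c^{(1)}, \ldots, c^{(m)},\ \mu_1, \ldots, \mu_K} J(c^{(1)}, \ldots, c^{(m)}, \mu_1, \ldots, \mu_K)$ 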
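As a small worked example of the cost function, the sketch below evaluates J as the mean squared distance between each example and the centroid of the cluster it is assigned to, which is the usual K-means distortion. The function names are hypothetical and cluster indices are 0-based here.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def distortion(points, assignments, centroids):
    """J = (1/m) * sum_i ||x_i - mu_{c_i}||^2 for 0-based assignments c_i."""
    m = len(points)
    total = sum(
        squared_distance(x, centroids[c]) for x, c in zip(points, assignments)
    )
    return total / m


if __name__ == "__main__":
    points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
    assignments = [0, 0, 1]                 # c_1, c_2, c_3
    centroids = [(1.0, 0.0), (10.0, 0.0)]   # mu_1, mu_2
    # Squared distances are 1, 1 and 0, so J = 2/3.
    print(distortion(points, assignments, centroids))
```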


K-means algorithm 

Randomly initialize K cluster centroids $\mu_1, \mu_2, \ldots, \mu_K \in \mathbb{R}^n$ 


Repeat { 

    for i = 1 to m 

        $c^{(i)}$ := index (from 1 to K) of cluster centroid 
                     closest to $x^{(i)}$ 
    for k = 1 to K 

        $\mu_k$ := average (mean) of points assigned to 
                   cluster k 

} 
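A standard property of this loop, not spelled out in the text above, is that neither step can increase the cost J: the assignment step picks the closest centroid for each point with the centroids held fixed, and the move step picks the mean with the assignments held fixed. The sketch below records J after every half-step so that this can be checked on a small data set. All function names are hypothetical, indices are 0-based, and an empty cluster keeps its old centroid.

```python
import random


def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def cost(points, c, mu):
    """Mean squared distance from each point to its assigned centroid."""
    return sum(sq_dist(x, mu[ci]) for x, ci in zip(points, c)) / len(points)


def assign(points, mu):
    """Cluster assignment step (centroids held fixed)."""
    return [min(range(len(mu)), key=lambda k: sq_dist(x, mu[k])) for x in points]


def move(points, c, mu):
    """Move centroid step (assignments held fixed)."""
    new_mu = []
    for k in range(len(mu)):
        members = [x for x, ci in zip(points, c) if ci == k]
        if members:
            new_mu.append(tuple(sum(col) / len(members) for col in zip(*members)))
        else:
            new_mu.append(mu[k])  # assumption: empty cluster stays put
    return new_mu


def cost_trace(points, k, iters=10, seed=0):
    """Return J after each assignment step and each move step."""
    rng = random.Random(seed)
    mu = [tuple(p) for p in rng.sample(points, k)]
    trace = []
    for _ in range(iters):
        c = assign(points, mu)
        trace.append(cost(points, c, mu))
        mu = move(points, c, mu)
        trace.append(cost(points, c, mu))
    return trace


if __name__ == "__main__":
    data = [(0.0, 0.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0), (5.0, 5.0)]
    print(cost_trace(data, k=2, iters=4))
```

Up to floating-point rounding, the printed sequence should never go up from one entry to the next.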


Random initialization 

> Should have K < m 

> Randomly pick K training 
examples. 

> Set $\mu_1, \ldots, \mu_K$ equal to these K 
examples. 

> As the two illustrations on the 
right suggest, K-means can end up 
converging to different solutions 
depending on exactly how the 
clusters were initialized. 
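The initialization rule above (pick K distinct training examples and use them as the starting centroids) might look like the following sketch. The helper name `init_centroids` and the optional `rng` argument are assumptions made for illustration.

```python
import random


def init_centroids(points, k, rng=None):
    """Pick k distinct training examples to serve as initial centroids."""
    if not k < len(points):
        raise ValueError("K-means expects K < m (fewer clusters than examples)")
    if rng is None:
        rng = random  # fall back to the module-level generator
    # random.sample draws without replacement, so no example is picked twice.
    return [tuple(p) for p in rng.sample(points, k)]


if __name__ == "__main__":
    data = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
    # Different seeds can give different starting centroids, which is why
    # separate runs of K-means may converge to different clusterings.
    print(init_centroids(data, 2, random.Random(1)))
    print(init_centroids(data, 2, random.Random(2)))
```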



Random initialization 
For i = 1 to 100 { 


Randomly initialize K-means. 

Run K-means. Get $c^{(1)}, \ldots, c^{(m)}, \mu_1, \ldots, \mu_K$. 

Compute cost function 

$J(c^{(1)}, \ldots, c^{(m)}, \mu_1, \ldots, \mu_K)$ 

} 


Pick clustering that gave lowest cost $J(c^{(1)}, \ldots, c^{(m)}, \mu_1, \ldots, \mu_K)$ 
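The restart loop above can be sketched as follows: run K-means from several random initializations and keep the run with the lowest cost J. This is an illustrative sketch, not a definitive implementation. The names `run_kmeans`, `cost` and `best_of_restarts` are hypothetical, J is taken to be the mean squared distance to the assigned centroid, and indices are 0-based.

```python
import random


def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def run_kmeans(points, k, rng, max_iters=100):
    """One K-means run from a random start; returns (assignments, centroids)."""
    mu = [tuple(p) for p in rng.sample(points, k)]  # k random training examples
    c = None
    for _ in range(max_iters):
        new_c = [min(range(k), key=lambda j: sq_dist(x, mu[j])) for x in points]
        if new_c == c:
            break  # converged
        c = new_c
        for j in range(k):
            members = [x for x, cj in zip(points, c) if cj == j]
            if members:  # assumption: an empty cluster keeps its centroid
                mu[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return c, mu


def cost(points, c, mu):
    return sum(sq_dist(x, mu[cj]) for x, cj in zip(points, c)) / len(points)


def best_of_restarts(points, k, n_restarts=100, seed=0):
    """Return (J, assignments, centroids) for the lowest-cost run."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_restarts):
        c, mu = run_kmeans(points, k, rng)
        j = cost(points, c, mu)
        if best is None or j < best[0]:
            best = (j, c, mu)
    return best


if __name__ == "__main__":
    data = [(0.0,), (1.0,), (10.0,), (11.0,), (20.0,), (21.0,)]
    j, labels, centers = best_of_restarts(data, k=3)
    print(j, sorted(centers))
```

Keeping only the best run costs nothing beyond a comparison per restart, and it guards against the poor local optima that a single unlucky initialization can produce.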
Clustering 


Choosing the number of clusters 



What is the right value of K? 



Choosing the value of K 

Elbow method: 

[Plots: cost function J (vertical axis) versus K, the number of clusters (horizontal axis)] 
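The curve behind the elbow method can be computed by running K-means for a range of K values and recording the cost J for each. The sketch below does this under some assumptions: the names `kmeans_cost` and `elbow_curve` are hypothetical, J is the mean squared distance to the assigned centroid, and each K uses the best of several random restarts so that one poor initialization does not distort the curve.

```python
import random


def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans_cost(points, k, rng, max_iters=100):
    """One K-means run from a random start; returns the final cost J."""
    mu = [tuple(p) for p in rng.sample(points, k)]
    c = None
    for _ in range(max_iters):
        new_c = [min(range(k), key=lambda j: sq_dist(x, mu[j])) for x in points]
        if new_c == c:
            break  # converged
        c = new_c
        for j in range(k):
            members = [x for x, cj in zip(points, c) if cj == j]
            if members:  # assumption: an empty cluster keeps its centroid
                mu[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return sum(sq_dist(x, mu[cj]) for x, cj in zip(points, c)) / len(points)


def elbow_curve(points, k_values, n_restarts=50, seed=0):
    """Map each K to the lowest J found over n_restarts random starts."""
    rng = random.Random(seed)
    return {
        k: min(kmeans_cost(points, k, rng) for _ in range(n_restarts))
        for k in k_values
    }


if __name__ == "__main__":
    # Three tight pairs on a line, so the cost should drop sharply up to K = 3.
    data = [(0.0,), (1.0,), (10.0,), (11.0,), (20.0,), (21.0,)]
    for k, j in elbow_curve(data, [1, 2, 3, 4]).items():
        print(k, round(j, 4))
```

Plotting these values against K and looking for the point where the curve flattens gives the "elbow". On real data the bend is often much less distinct than in this toy example.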



Choosing the value of K 


Sometimes, you’re running K-means to get clusters to use for some 
later/downstream purpose. Evaluate K-means based on a metric for 
how well it performs for that later purpose. 